Technical Test - Wise Athena

Access

If using Google Colab

If running locally

Libraries

If using Google Colab

If running locally

Reading the CSV files
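As a minimal sketch of this step, the snippet below loads CSV data into a DataFrame with pandas. The in-memory text and the column names (Store_Id, sku, units) are assumptions standing in for the real files, whose paths and layouts are not shown here.

```python
import io

import pandas as pd

# Hypothetical CSV content; in the notebook this would be a file path instead.
csv_text = "Store_Id,sku,units\n1,A,10\n2,B,5\n"

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)
print(df.dtypes)
```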

Understanding the Data

Questions about:

Checking for NaN and duplicates
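The same check applies to each of the tables listed below. A minimal sketch, assuming a toy DataFrame with hypothetical column names:

```python
import pandas as pd

# Hypothetical toy data; the real tables and their columns are assumptions here.
df = pd.DataFrame({
    "Store_Id": [1, 2, 2, 3],
    "sku": ["A", "B", "B", None],
    "units": [10, 5, 5, 7],
})

# Missing values per column.
null_counts = df.isna().sum()

# Rows that repeat an earlier row exactly.
duplicate_rows = int(df.duplicated().sum())

print(null_counts)
print("duplicate rows:", duplicate_rows)
```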

Sellin csv

Sellout_prov1

Sellout_prov2

Maestro_prod

Maestro_client

Merge DataFrames Sellout

Next, I merge the sellout DataFrames (from both suppliers) with maestro_prod (product descriptions) and maestro_client (store descriptions).

I compare the row counts before and after the merge to see whether every item was matched, for both suppliers.
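A minimal sketch of the merge and the row-count comparison, assuming inner joins on hypothetical key columns sku and Store_Id. The toy tables stand in for the real ones.

```python
import pandas as pd

# Hypothetical toy tables; column names other than the keys are assumptions.
sellout = pd.DataFrame({
    "Store_Id": [1, 2, 9],
    "sku": ["A", "B", "A"],
    "units": [3, 4, 5],
})
maestro_prod = pd.DataFrame({"sku": ["A", "B"], "description": ["Prod A", "Prod B"]})
maestro_client = pd.DataFrame({"Store_Id": [1, 2], "store_name": ["North", "South"]})

# Inner joins keep only the rows that match in both tables.
merged = (
    sellout
    .merge(maestro_prod, on="sku", how="inner")
    .merge(maestro_client, on="Store_Id", how="inner")
)

# A positive difference means some sellout rows found no match.
rows_lost = len(sellout) - len(merged)
print("rows lost in merge:", rows_lost)
```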

With this in mind, I check whether every store_Id in the products table also appears in the stores table.

Some stores are missing, so there are two options:
1. Report the problem to the client and ask which stores these are.
2. Ignore these stores and continue.

I chose the second option because I can't talk to the client.
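The check and the second option could look like the sketch below. The table contents and the Store_Id column name are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy tables.
sellout = pd.DataFrame({"Store_Id": [1, 2, 9, 9], "units": [3, 4, 5, 6]})
maestro_client = pd.DataFrame({"Store_Id": [1, 2], "store_name": ["North", "South"]})

# True for sellout rows whose store is absent from the store table.
missing_mask = ~sellout["Store_Id"].isin(maestro_client["Store_Id"])
missing_ids = sorted(sellout.loc[missing_mask, "Store_Id"].unique().tolist())

# Option 2: keep only rows with a known store.
sellout_known = sellout[~missing_mask]

print("unknown stores:", missing_ids)
```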

After this, I checked whether the stores table contains any duplicated Ids.

The same Id is assigned to different shops, which leaves three options:
1. Ask the client for the correct Id and change the Store_Id accordingly.
2. Replace the Store_Id with a new Id.
3. Delete the second entry and continue (index 235 for the first supplier, index 2422 for the second supplier).

I chose to delete the second entry because I can't talk to the client, and assigning a new Id would still cause an error: no product would match that Store_Id.
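A minimal sketch of the duplicate check and the third option, assuming a toy stores table. Dropping with keep="first" removes the second entry for each repeated Id.

```python
import pandas as pd

# Hypothetical stores table with one Id assigned to two shops.
maestro_client = pd.DataFrame({
    "Store_Id": [1, 2, 2],
    "store_name": ["North", "South", "East"],
})

# keep=False flags every row of a duplicated Id, so both shops are visible.
conflicts = maestro_client[maestro_client.duplicated(subset="Store_Id", keep=False)]

# Option 3: keep the first shop for each Id and drop the later ones.
deduped = maestro_client.drop_duplicates(subset="Store_Id", keep="first")

print(conflicts)
```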

I then concatenated the DataFrames, setting these problems aside.
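The concatenation can be sketched as follows. The supplier column added here is an assumption, included so each row stays traceable to its source table.

```python
import pandas as pd

# Hypothetical per-supplier sellout tables with identical columns.
sellout_prov1 = pd.DataFrame({"Store_Id": [1, 2], "sku": ["A", "B"], "units": [3, 4]})
sellout_prov2 = pd.DataFrame({"Store_Id": [3], "sku": ["A"], "units": [5]})

# Tag each table with its supplier (an assumed column), then stack them.
sellout_all = pd.concat(
    [sellout_prov1.assign(supplier=1), sellout_prov2.assign(supplier=2)],
    ignore_index=True,  # rebuild a clean 0..n-1 index
)

print(sellout_all)
```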

Next, I checked whether every sku in the store description DataFrame is also in the product description DataFrame.

Again, we have to identify this product, invent it, or ignore it.
In this case I ignore it, because I can't ask the client and an invented product would give unrealistic results.
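One way to find the unmatched product is a left merge with an indicator column. This is a sketch under assumptions: the left table is called sellout here, and the data is invented for illustration.

```python
import pandas as pd

# Hypothetical tables; "Z" has no entry in the product table.
sellout = pd.DataFrame({"sku": ["A", "B", "Z"], "units": [3, 4, 5]})
maestro_prod = pd.DataFrame({"sku": ["A", "B"], "description": ["Prod A", "Prod B"]})

# indicator=True adds a _merge column: "both" or "left_only" for a left join.
checked = sellout.merge(maestro_prod, on="sku", how="left", indicator=True)

unknown_skus = checked.loc[checked["_merge"] == "left_only", "sku"].unique().tolist()

# Ignore the unknown product: keep matched rows only.
sellout_known = checked[checked["_merge"] == "both"].drop(columns="_merge")

print("unknown skus:", unknown_skus)
```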

Next, I checked that the merge was done correctly.

Now I reorder the columns so the DataFrame is easier to read.
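Reordering amounts to selecting the columns in the desired order. A minimal sketch, with a hypothetical column order:

```python
import pandas as pd

# Hypothetical merged table with columns in an awkward order.
merged = pd.DataFrame({
    "units": [3],
    "description": ["Prod A"],
    "Store_Id": [1],
    "sku": ["A"],
    "date": ["2017-01-20"],
})

# Columns to show first (an assumed preference); the rest keep their order.
preferred = ["date", "Store_Id", "sku", "description"]
remaining = [c for c in merged.columns if c not in preferred]
merged = merged[preferred + remaining]

print(list(merged.columns))
```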

Merge DataFrames Sellin

First, I check whether every sku in the sellin DataFrame is in maestro_prod (the product description DataFrame).

The sellin DataFrame has no supplier information, but I remembered that:

So, in this case, I merged by date and ignored the 2015 data, on the assumption that supplier 1 only bought in 2017 and supplier 2 only bought in 2016.
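The year assumption can be expressed as a simple mapping. The sketch below is a simplified stand-in for the date-based merge, with hypothetical column names and supplier labels. Rows from years outside the mapping (2015 here) are dropped.

```python
import pandas as pd

# Hypothetical sellin rows spanning three years.
sellin = pd.DataFrame({
    "date": pd.to_datetime(["2015-03-01", "2016-06-15", "2017-01-20"]),
    "sku": ["A", "B", "A"],
    "units": [100, 200, 150],
})

# Assumption from the text: supplier 1 bought only in 2017, supplier 2 only in 2016.
year_to_supplier = {2017: "supplier_1", 2016: "supplier_2"}

# Years missing from the mapping become NaN and are then dropped.
sellin["supplier"] = sellin["date"].dt.year.map(year_to_supplier)
sellin_labeled = sellin.dropna(subset=["supplier"]).reset_index(drop=True)

print(sellin_labeled)
```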

Now I reorder the columns so the DataFrame is easier to read.

Pandas Profiling

Questions about:

Which products are the best sellers?

From manufacturer to supplier (sellin)

From supplier to final user (sellout)

Which products sell the least?
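Both this question and the best-seller question can be answered from one ranking of total units per product. A minimal sketch, assuming hypothetical sku and units columns:

```python
import pandas as pd

# Hypothetical sales rows (could be sellin or sellout).
sales = pd.DataFrame({
    "sku": ["A", "B", "A", "C"],
    "units": [10, 4, 6, 1],
})

# Total units per product, highest first.
units_by_sku = sales.groupby("sku")["units"].sum().sort_values(ascending=False)

best_seller = units_by_sku.index[0]
least_seller = units_by_sku.index[-1]

print(units_by_sku)
```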

Where are the most products sold?

On which dates are sales highest?
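One way to approach this for either table is to total the units per month. The monthly granularity and the column names are assumptions in this sketch.

```python
import pandas as pd

# Hypothetical dated sales rows.
sales = pd.DataFrame({
    "date": pd.to_datetime(["2016-11-03", "2016-12-10", "2016-12-24", "2017-01-05"]),
    "units": [10, 30, 25, 8],
})

# Group by a "YYYY-MM" label derived from the date.
monthly_units = sales.groupby(sales["date"].dt.strftime("%Y-%m"))["units"].sum()

# Label of the month with the highest total.
busiest_month = monthly_units.idxmax()

print(monthly_units)
```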

From manufacturer to supplier (sellin)

From supplier to final user (sellout)

Which supplier do you sell the most to?

Document for Data Scientists